O(mn log σ) Time Transposition Invariant LCS Computation

Authors

  • Szymon Grabowski
  • Gonzalo Navarro
Abstract

Given strings $A$ and $B$ of lengths $m$ and $n$ over a finite alphabet $\Sigma \subset \mathbb{Z}$ of size $O(\sigma)$, the length of the longest common transposition invariant subsequence is $\mathrm{LCTS}(A,B) = \max_{t \in \mathbb{Z}} \mathrm{LCS}(A + t, B)$, where $A + t = (a_1 + t)(a_2 + t) \ldots (a_m + t)$ and $\mathrm{LCS}(A + t, B)$ is the length of the longest common subsequence of $A + t$ and $B$. $\mathrm{LCTS}(A,B)$ can be computed naively in $O(mn\sigma)$ time. We present a simple, easy-to-implement algorithm that runs in $O(mn \log \sigma)$ time. We also show that the transposition invariant Levenshtein distance can be computed in $O(mn\sqrt{\sigma})$ time.
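The naive baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the definition only, not the paper's O(mn log σ) algorithm: it runs a standard LCS dynamic program once for each transposition t that could produce at least one match. The function names `lcs_length` and `lcts_naive` are hypothetical and chosen for this example; the inputs are assumed to be sequences of integers.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, via the classic O(mn) DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcts_naive(a, b):
    """Naive LCTS(A, B): maximize LCS(A + t, B) over candidate transpositions t.

    Only values t = b_j - a_i can make some pair of symbols match, so it is
    enough to try those. For an integer alphabet of size O(sigma) there are
    O(sigma) such values, which gives the O(mn * sigma) bound.
    """
    if not a or not b:
        return 0
    transpositions = {y - x for x in a for y in b}
    return max(lcs_length([x + t for x in a], b) for t in transpositions)
```

For example, `lcts_naive([60, 62, 64, 65], [67, 50, 69, 71, 72])` shifts the first sequence by t = 7 and matches all four of its symbols, even though the untransposed sequences share none.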

Related resources

Speeding up transposition-invariant string matching

Finding the longest common subsequence (LCS) of two given sequences $A = a_0 a_1 \ldots a_{m-1}$ and $B = b_0 b_1 \ldots b_{n-1}$ is an important and well-studied problem. We consider its generalization, transposition-invariant LCS (LCTS), which has recently arisen in the field of music information retrieval. In LCTS, we look for the longest common subsequence between the sequences $A + t = (a_0 + t)(a_1 + t)$ ...

Bit-Parallel Branch and Bound Algorithm for Transposition Invariant LCS

Main Results. We consider the problem of longest common subsequence (LCS) of two given strings in the case where the first may be shifted by some constant (i.e. transposed) to match the second. For this longest common transposition invariant subsequence (LCTS) problem, that has applications for instance in music comparison, we develop a branch and bound algorithm with best case time O((m + log ...

Improved Time and Space Complexities for Transposition Invariant String Matching

Given strings $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$ over a finite alphabet $\Sigma \subset \mathbb{Z}$ of size $O(\sigma)$, and a distance $d()$ defined among strings, the transposition invariant version of $d()$ is $d^t(A,B) = \min_{t \in \mathbb{Z}} d(A + t, B)$, where $A + t = (a_1 + t)(a_2 + t) \ldots (a_m + t)$. Distances $d()$ of most interest are Levenshtein distance and indel distance (the dual of the Longest Common Subsequence), which can be computed i...

Restricted Transposition Invariant Approximate String Matching Under Edit Distance

Let A and B be strings with lengths m and n, respectively, over a finite integer alphabet. Two classic string matching problems are computing the edit distance between A and B, and searching for approximate occurrences of A inside B. We consider the classic Levenshtein distance, but the discussion also applies to indel distance. A relatively new variant [8] of string matching, motivated in...

SIA(M)ESE: An Algorithm for Transposition Invariant, Polyphonic Content-Based Music Retrieval

We introduce a novel algorithm for transposition-invariant content-based polyphonic music retrieval. Our SIA(M)ESE algorithm is capable of finding transposition-invariant occurrences of a given template in a database of polyphonic music called a dataset. We allow arbitrary gapping, i.e., between musical events in the dataset that have been found to match points in the template, there may be any...

Publication date: 2008